Title: Improvements in or Relating to Mutagenesis of Nucleic Acids
Field of the Invention 

This invention relates to certain novel compounds, a method of mutating a nucleic acid
sequence involving the novel compound, and to a kit for performing the method of the 
invention. 

Background of the Invention 

In vitro site-directed mutagenesis, which involves the substitution of single amino acids 
in a protein by changing the relevant base residues in the encoding DNA, has proved to
be a powerful method in protein engineering. This technique typically requires information
on the structure-function relationship of the protein under study in order to provide a
rationale for generating mutants with altered properties. In contrast, random mutagenesis
of the DNA region of interest coupled with adequate screening or selection procedures
provides an alternative and general method for the generation of DNA, RNA or protein
species with improved or novel functions in the absence of initial structural information.

Several methods for the generation of mutants of large DNA fragments have been 
described and involve using pools of random sequence synthetic oligonucleotides
(Matteucci & Heyneker, Nucl. Acids Res. 1983 11, 3113; Wells et al., Gene 1985 54,
315; Nerr et al., DNA 1988 7, 127 and references therein), chemical modification of the
target sequence (Kadonaga & Knowles, Nucl. Acids Res. 1985 13, 1733; Meyers et al.,
Science 1985 229, 242, and references described therein); or base misincorporation using
an error-prone polymerase (Lehtovaara et al., Protein Eng. 1988 2, 63).

The synthetic oligonucleotide approach is restricted by the length of the DNA amenable 
to chemical synthesis, whilst the chemical approach is often labour intensive. In other 
approaches, random mutations are generated using the polymerase chain reaction (PCR). 
One such method relies exclusively on the intrinsic error frequency of Taq DNA 
polymerase, resulting in about 0.5 x 10^-3 mutations per base pair (Zhou, Nucl. Acids Res.
1991 19, 6052). In an improved variation of this method the target sequence of interest
is copied under conditions which further reduce the fidelity of DNA synthesis catalysed
by Taq DNA polymerase, e.g. by the addition of the cofactor manganese and by the use
of high concentrations of magnesium and the relevant deoxynucleoside
triphosphates (dNTPs; see Leung et al., Techniques 1989 7, 11). Using the latter
procedure, mutation frequencies in the order of 20 x 10^-3 mutations per base pair have been
claimed.

An alternative approach to PCR-based random mutagenesis is to replace, partially or fully, 
the 5'-triphosphates of the four natural nucleosides by the triphosphates of nucleoside
analogues which display ambivalent base pairing potential. To our knowledge this
approach has only been attempted using deoxyinosine triphosphate, dITP (Spee et al.,
Nucl. Acids Res. 1993 21, 777; Ikeda et al., J. Biol. Chem. 1992 267, 6291). However,
this analogue is a poor substrate for Taq polymerase and cannot support DNA synthesis
when replacing any of the four normal dNTPs. As a result, four separate PCR reactions
are required containing dITP and three dNTPs in equal concentrations together with
limiting concentrations of the fourth dNTP. The four separate PCR products are then
pooled and cloned (Spee et al., cited above).

A general feature of the above procedures is that the yield of mutant sequences is low and 
that the pattern of mutations is heavily biased towards transitions (pyrimidine-pyrimidine 
or purine-purine substitutions). In addition, with the last two methods, undesirable base 
additions or deletions occur at an appreciable rate. 
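
The distinction between transitions and transversions drawn above can be illustrated with a short Python sketch. It is illustrative only and forms no part of the original disclosure; the function name `classify_substitution` is a hypothetical name introduced here.

```python
# Illustrative sketch: classify a single-base substitution in DNA.
# Purine<->purine or pyrimidine<->pyrimidine changes are transitions;
# purine<->pyrimidine changes are transversions.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(original: str, mutant: str) -> str:
    """Return 'transition', 'transversion' or 'none' for a base change."""
    original, mutant = original.upper(), mutant.upper()
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutant not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if original == mutant:
        return "none"
    pair = {original, mutant}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

For example, `classify_substitution("A", "G")` falls in the transition class, whereas `classify_substitution("A", "C")` falls in the transversion class.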

In an alternative approach, it was envisaged that the 5'-triphosphates of a pyrimidine or 
purine nucleoside analogue capable of inducing transition mutations in combination with
other triphosphate analogues capable of causing transversion mutations would allow
efficient random mutagenesis via PCR. The nucleoside analogues P (Kong Thoo Lin &
Brown, Nucl. Acids Res. 1989 17, 10373) and K (Brown & Kong Thoo Lin,
Carbohydrate Res. 1991 276, 129), (structures 1 and 3 respectively, shown in Figure 1)
have previously been incorporated into oligonucleotides and demonstrate ambivalent base
pairing potential, as illustrated for P in Figure 2. That is, P forms base pairs of
equivalent stability with adenine and guanine. Likewise K forms base pairs with closely
similar stabilities with thymine and cytosine. In addition, template DNA containing these
analogues is recognised by polymerases such as Taq polymerase in PCR and Sequenase™
in DNA sequencing (Kong Thoo Lin & Brown, Nucleic Acids Res. 1992 20, 5149;
Kamiya et al., Nucleosides & Nucleotides 1994 13, 1384; Brown & Kong Thoo Lin,
Collect. Czech. Chem. Commun. (Special issue), 1990 55, 213). The present inventors
considered that other analogues, e.g. 2'-deoxy-8-hydroxyguanosine 5'-triphosphate,
abbreviated as 8-oxodGTP (Pavlov et al., Biochemistry 1994 33, 4695), shown as structure
5 in Figure 1, might be valuable in this context in order to generate transversion
mutations.

Summary of the Invention 

In a first aspect the invention provides a compound having the structure set forth below:- 


where X1 = O, S, N-alkyl, N+-dialkyl, or N-benzyl; X2 = triphosphate (P3O9)4-,
diphosphate (P2O6)3-, thiotriphosphate (P3O8S)4-, or analogues thereof, but not H; and
X3 = H, NH2, F or OR, where R may be any group, but is preferably H, methyl, allyl or
alkaryl.



These compounds have not previously been synthesised. In preferred embodiments X2 is
triphosphate. Conveniently, X1 is O, and preferably X3 is H or OH. Typically R is H,
methyl, allyl or alkaryl. A compound "dP" which has been synthesised previously (and
which is outside the scope of the claims) has the general structure above where X1 is O,
X2 is H (such that the compound is not within the scope of the invention), and X3 is OH.
A novel compound within the scope of the invention, and which represents a preferred
embodiment thereof, is the triphosphate of dP, termed dPTP.

The compounds of the present invention, and dPTP in particular, have unexpected
properties (some of which are described below) which could not have been predicted from
the prior art, rendering the invention non-obvious. The compounds of the invention may
act as nucleoside triphosphate analogues (especially where X3 is H or OH) and thus have
a wide range of potential uses, one of which is described in detail below. 

The present invention also relates to the synthesis of hydrogen bond ambivalent purine and 
pyrimidine nucleoside triphosphates and their application in PCR-based random 
mutagenesis, and to the generation of polynucleotide libraries (particularly, large libraries) 
based on an original defined template sequence from which the single species are obtained 
by simple cloning methods. In particular the invention involves the synthesis and use of
the novel degenerate pyrimidine deoxynucleoside triphosphates of the type shown in 
structure 6, (in Figure 1) together, in preferred embodiments, with analogues of the types 
shown in structure 5 and/or structure 4 (in Figure 1). The invention is exemplified using
a PCR-based system for random mutagenesis of DNA sequences, which employs mixtures 
of the novel triphosphate dPTP (structure 2 in Figure 1, synthesis of which is described 
in detail below), in conjunction with the already known analogue 8-oxodGTP (structure 
5, where R' = NH2), (Mo et al., Proc. Natl. Acad. Sci. USA 1992 89, 11021).

In a second aspect therefore the invention provides a method of mutating a nucleic acid 
sequence, comprising replicating a template sequence in the presence of a nucleotide 
analogue according to the general structure defined above, so as to form non-identical 
copies of the template sequence comprising the nucleotide analogue residue. In a 
preferred embodiment the nucleotide analogue is 6-(2-deoxy-β-D-erythropentofuranosyl)-
3,4-dihydro-8H-pyrimido-[4,5-c][1,2]oxazine-7-one 5'-triphosphate (abbreviated as
deoxyP triphosphate or dPTP).

It will be apparent to those skilled in the art that slight modifications to the structure of
dPTP may be effected without substantially disrupting the utility of the compound for use 
in the method of the invention. Accordingly, such slightly modified forms of dPTP may 
be regarded as functional equivalents of dPTP and their use is intended to fall within the 
scope of the invention defined above. Particular examples of such modified forms are 
shown in structure 6 in Figure 1, where X may be S, N-alkyl (particularly N-methyl,
N-ethyl or N-propyl), N+-dialkyl (e.g. dimethyl, diethyl or dipropyl) or N-benzyl (with or
without substitutions in the benzene ring). The group at position X, when the analogue
is incorporated into DNA, is thought to project into the major groove of the double helix,
such that quite bulky groups can be successfully accommodated. With the benefit of the
disclosure contained herein, and in the publication of Loakes & Brown (1995 Nucleosides
and Nucleotides 14, 291), the above modifications, and possibly others, will be apparent 
to those skilled in the art. 

The "non-identical copies" produced by the method are DNA sequences synthesised from 
a template (and thus may be considered copies thereof) but contain one or more mutations
relative to the template and so are not identical thereto. Typical mutation frequencies
attained by the present invention are in the range 1 to 20%, more particularly 2 to 10%,
but it will be apparent to those skilled in the art from the information contained herein
that the mutation frequency can be controlled (which is an advantage of the present
invention) to set the limit at the desired level. For most purposes however the range of 
1 to 20% for mutation frequency will be preferred. This range is sufficiently high as to 
be reasonably likely to introduce a significant change in the transcription and/or translation 
product, but is not so high as to inevitably abolish whatever desirable characteristics may 
have been possessed by the transcription or translation products of the template sequence. 
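
The practical meaning of a given mutation frequency can be illustrated with simple arithmetic. The sketch below is illustrative only: it assumes, for simplicity, that each base position mutates independently with the same probability, which the specification does not assert, and the function names are hypothetical.

```python
# Illustrative arithmetic only. Assumption (not from the specification):
# every base position mutates independently with the same probability.

def expected_mutations(length_bp: int, frequency: float) -> float:
    """Expected number of point mutations in a sequence of length_bp bases."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    return length_bp * frequency


def fraction_unmutated(length_bp: int, frequency: float) -> float:
    """Probability that a copy carries no mutation at all."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    return (1.0 - frequency) ** length_bp
```

Under this assumption, a 350 base pair sequence mutated at a frequency of 2% would carry seven point mutations on average, and only a very small fraction of copies would remain unmutated.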

Preferably the template sequence is replicated by a method comprising the use of an
enzyme, desirably a DNA polymerase without a 3'→5' exonuclease "editing" function,
conveniently by performance of the Polymerase Chain Reaction (PCR), typically using a
thermostable enzyme such as Taq polymerase. Conveniently, the template sequence will
be replicated in the additional presence of the four normal dNTPs (i.e. dATP, dCTP,
dGTP and TTP). Typically dPTP will be present in substantially equimolar ratio with the
majority of the four normal dNTPs (although the relative concentrations may
advantageously be altered, depending on the number and nature of mutations desired, and 
depending on the presence or absence of other reagents, as described below). 

In preferred embodiments the template sequence will be replicated in the presence of one 
or more additional analogue triphosphates. Desirably such additional analogues will cause 
the introduction of transversion mutations. Suitable examples of desirable analogue 
triphosphates include dKTP and 8-oxodGTP (mentioned above) and O4-ethylthymidine
triphosphate (Singer et al., 1989 Biochemistry 28, 1478-1483).

Once the non-identical copies of the template sequence have been obtained, these are 
desirably replicated in the presence of the four normal dNTPs (namely dGTP, dCTP, 
dATP and TTP) but in the absence of analogues thereof, to replace the nucleotide 
analogue residues and "establish" the mutations. This second-stage replication may be
performed in vivo (e.g. by introducing the non-identical copies, inserted into a vector or
not, into a suitable laboratory organism, such as E. coli or other microorganism, which
organism will then replace the dPTP residues in the introduced DNA by means of natural
DNA repair machinery). Preferably however the second-stage replication is performed
in vitro, conveniently by means of PCR. This allows greater control over the number and
type of mutations sought to be introduced into the DNA sequence and, compared to
performance of the second-stage replication in vivo, avoids the possibility that repair
enzymes in the host might adversely affect the established mutations. It is found that the
method of the present invention confers several advantages over the known prior art
methods of mutagenesis.

Firstly, it yields a high frequency of sequences carrying point mutations which, for many 
investigative purposes, are the most informative types of mutations. Secondly the method 
produces insertion and deletion mutations only at an insignificant frequency. This is 
important because such mutations cause frame-shifts in coding sequences and so are 
generally undesirable. In addition, the desired transversion and transition mutations are
obtained at a high rate, and all possible types of such transition mutations can be obtained. 

Use of PCR to replicate the template sequence is especially desirable as it allows control
of the mutation frequency. The inventors have surprisingly found that there is a 
substantially linear correlation between the mutation frequency and the number of PCR 
cycles performed. This linear relation holds for up to about 30 PCR cycles and may 
extend over a wider range. In addition, further influence on mutation frequency may be 
effected by alteration of the concentration of deoxynucleoside triphosphates (and/or
analogues thereof).

In summary therefore, the method of the invention differs from those previously described 
in a number of points, including: (i) it yields a high frequency of sequences carrying point 
mutations; (ii) it does not produce insertions and deletions at a significant frequency; (iii) 
it produces relatively high rates of transversion and transition mutations; (iv) all possible 
types of transition mutations, and some types of transversion mutations, can be generated; 
(v) it enables efficient mutagenesis to be conducted in a single DNA amplification reaction;
(vi) it allows control of the mutational load in the amplified polynucleotide products
inter alia through cycle number and deoxynucleoside triphosphate ratios; and (vii) it is
suitable for randomisation of very long sequences (up to several kilobases), which has
been problematical using prior art methods. Thus the use of appropriate mixtures of
triphosphate derivatives of nucleoside analogues in accordance with the present invention
enables highly controlled random mutagenesis of DNA sequences resulting in nucleotide
substitutions in any DNA and corresponding amino acid substitutions in the derived
polypeptides, which cannot efficiently be achieved by existing methods. 

The method of the invention has clear utility in protein engineering. In addition, there is 
increasing interest in structure/function relationships in RNA molecules (see, for example,
Bartel & Szostak 1993 Science 261, 1411-1418).

Thus the method will be particularly useful for the construction of libraries of DNA 
sequences directing synthesis of variant transcription (i.e. RNA) or translation (i.e.
polypeptide) products. In view of the difficulties previously presented in connection with
prior art methods, the present invention will be especially useful in the preparation of
libraries of long (several kilobases or more) sequences, which are not amenable to
generation by other random mutagenesis methods.

In a further aspect the invention provides a kit for introducing mutations into a nucleic
acid sequence, comprising dPTP, means for replicating a template sequence in the
presence thereof so as to incorporate the analogue into non-identical copies of the template
sequence, and instructions for performing the method defined above. Conveniently the
means for replicating the template sequence comprises means for performing the
polymerase chain reaction (PCR). The kit may advantageously further comprise
8-oxodGTP and/or dKTP, and/or O4-ethylthymidine triphosphate.

In another aspect the invention provides a compound having the structure set forth below: 



where Y1 = OH, O-alkyl, NH2 or N(alkyl)2; Y2 = H or NH2; Y3 = triphosphate (P3O9)4-,
diphosphate (P2O6)3-, thiotriphosphate (P3O8S)4-, or analogues thereof, but not H; and Y4
= H, NH2, F, or OR, where R may be any group but is preferably H, methyl, allyl or
alkaryl.

In preferred embodiments, Y1 is OCH3, Y3 is triphosphate, and Y4 is H or OH. A
particular example of a preferred embodiment is the nucleotide analogue dKTP. The
compounds of the invention have unexpected characteristics and a variety of potential 
applications, particularly as nucleotide analogues. The compounds may be used, for 
example, in a method of mutagenesis, similar to that described above in relation to the 
second aspect of the invention, the preferred features of which are generally common to 

In yet a further aspect, the invention provides a method of making in vitro a DNA or RNA
sequence comprising at least one base analogue, the method comprising treating in 
appropriate conditions a mixture comprising the four normal dNTPs (or rNTPs) and a 
novel nucleotide (or ribonucleotide) triphosphate analogue in accordance with the 
invention, with a DNA (or RNA) polymerase in the presence of template nucleic acid 
strand, so as to form a sequence of nucleotides (or ribonucleotides) comprising at least one 
analogue. 

The invention will now be further described by way of example and with reference to the 
accompanying drawings, of which: 

Figure 1 shows the structural formulae of various compounds 1-6; 

Figure 2 illustrates schematically the base-pairing of P with adenine and guanine, the
ambiguity of which is partly the basis for a powerful transition mutagenic effect; 

Figures 3A and 3B show photographs of gel electrophoresis of PCR products
demonstrating incorporation into DNA and extension of dPTP (A) and 8-oxodGTP (B) by
Taq polymerase. A) The PCR reaction mix included: dATP, dGTP, dCTP, TTP in
sample 1; dGTP, dCTP, TTP, dPTP in sample 2; dATP, dGTP, TTP, dPTP in sample 3;
dCTP, TTP, dPTP in sample 4; dATP, dGTP, TTP, dPTP in sample 5; dATP, dGTP,
dCTP, dPTP in sample 6; dATP, dGTP, dPTP in sample 7; dATP, dGTP, dCTP, TTP,
dPTP in sample 8. All dNTPs were at 500 µM, except in samples 4 and 7 where dPTP
was at 1 mM.

B) The PCR reaction mix included: dATP, dCTP and TTP at 500 µM. Samples 1 to 4
contained dGTP at 50 µM, 25 µM, 12.5 µM and 6.25 µM respectively, and 8-oxodGTP
at 500 µM. Samples 5 to 8 contained the same decreasing amounts of dGTP but no
8-oxodGTP;

Figure 3C shows a photograph of gel electrophoresis, demonstrating amplification by PCR
of different target genes in the presence of the four natural dNTPs (lanes 1 to 4);
equimolar concentrations of the four normal dNTPs and dPTP (lanes 5 to 8); and
equimolar concentrations of the four normal dNTPs, dPTP and 8-oxodGTP (lanes 9 to
12). The template DNA was: human macrophage stimulating protein (MSP) (lanes 1, 5
and 9); human connexin 31 (lanes 2, 6 and 10); human connexin 43 (lanes 3, 7 and 11);
or the r chain of human CD3 (lanes 4, 8 and 12). The size of the fragments is indicated
at the side of the Figure in kilobases. All the fragments were cloned in pBluescript and
DNA amplification was performed using standard T3 and T7 primers;

Figure 4 shows: (A) a time course of DNA synthesis in the presence of 12.5 µM [32P]dCTP,
dATP, dGTP and TTP (open diamonds) or dPTP (full squares). Primed M13mp18 was
used as a template for DNA synthesis in the presence of 0.3 U Taq polymerase. Figures
4B and 4C show the rate of DNA synthesis during the first 80 seconds of the reaction in
the presence of 12.5 µM dATP, dGTP and [32P]dCTP and the indicated concentrations of
dPTP (B) and TTP (C);

Figure 5 shows (Top): Plots of initial velocities against [dNTP] for the incorporation by 
Taq polymerase of dPTP opposite A (A), TTP opposite A (B), dPTP opposite G (C), and
dCTP opposite G (D). Below: Primer and templates used in experiments (Seq ID Nos.
3-5); 

Figure 6 shows the frequency of mutation of target DNA after different numbers of cycles of
mutagenesis by PCR. The four normal dNTPs and the analogues were used in equimolar
amounts (500 µM);

Figure 7 shows the pattern of mutations produced by dPTP, 8-oxodGTP and the mixture 
of the two. Data obtained after different numbers of PCR cycles have been pooled and
figures express percentage of total number of mutations; 

Figure 8 shows a summary of all the point mutations and the relative amino acid
replacements produced by dPTP in the target DNA sequence MH22 (Griffiths et al., 1994
EMBO J. 13, 3245-3260) as shown by sequence analysis of 12 individual clones.
Numbers at the top of the Table indicate how many times a particular codon is present in
the target sequence. Open squares indicate single point mutations within a particular
codon. Filled circles indicate two point mutations within a particular codon. In no case 
were three base substitutions found within a codon. Squares in shaded areas indicate silent 
mutations; 

Figure 9 is a summary of all the point mutations and the relative amino acid replacements 
produced by 8-oxodGTP in the target DNA sequence MH22 as shown by sequence
analysis of 8 independent clones (legend as for Figure 8). The point mutation indicated
with * (C→A) is not normally expected to result from mispairing of 8-oxodGTP;

Figure 10 shows the codon changes produced by dPTP (circles), 8-oxodGTP (squares) and
the combination of the two (triangles). Filled in symbols indicate a single nucleotide
change within a codon, open symbols denote two nucleotide changes within the same
codon. Diamonds indicate the presence of nucleotide changes different to those expected.
Amino acids are grouped in five classes according to their physico-chemical
characteristics: glycine, non-polar, polar, positively charged, and negatively charged.
Asterisks denote codons which were not present in the two target genes studied; and

Figure 11 shows mutations in the target DNA sequence MH22 (within dotted line, Seq ID
No. 1, amino acid sequence is Seq ID No. 2) and corresponding amino acid substitutions
(above dotted line) produced by dPTP (A), 8-oxodGTP (B) and the mixture of the two (C)
when used in equimolar amount (500 µM) with the four normal dNTPs in a PCR reaction.
The first number at the 5'-end of each sequence indicates how many PCR cycles were
allowed in the presence of the analogue(s). The second number identifies different clones.

Example 1 

Synthesis of dPTP, dKTP and 8-oxodGTP 

The 5'-triphosphate derivatives of P and K were prepared by the general procedure 
described by Ludwig (Ludwig, Acta Biochim. et Biophys. Acad. Sci. Hung. 1981 16, 
131). 8-oxodGTP was prepared from dGTP as described (Mo et al., cited above).
Purification by anion exchange chromatography (P and K) followed by reverse-phase
HPLC (P, K and 8-oxoG) gave the triphosphate samples, judged pure by nmr
and HPLC.

dPTP, dKTP and 8-oxodGTP as substrates for Taq polymerase 
Using Taq polymerase and a PCR programme of 30 x (92°C, 1 min; 55°C, 1.5 min;
72°C, 5 min) to amplify a 350 base pair DNA sequence it was found that dPTP could
completely replace TTP and yield amounts of product comparable to those obtained using
the four normal triphosphates (Figure 3A), and could replace dCTP to some extent. In
contrast, dKTP could replace dATP and dGTP, but only to a limited extent (although such
a low level of replacement may well be sufficient to produce a desired level of
mutagenesis). When 8-oxodGTP was used, some incorporation into DNA and its
extension could be demonstrated by using normal or limiting amounts of dGTP and by
compensating with higher concentrations of 8-oxodGTP. Figure 3B (lanes 1 to 4) shows
the PCR product obtained using 500 µM dATP, TTP, dCTP and 8-oxodGTP and
decreasing amounts of dGTP (from 50 µM to 6.25 µM). Lanes 5 to 8 show the PCR
products obtained using the same conditions but in the absence of 8-oxodGTP.

Kinetics of incorporation of dPTP by Taq polymerase 

In order to evaluate the performance of dPTP as a substrate for the enzyme Taq
polymerase, used in PCR, its rate of incorporation was analysed and compared with TTP
(since initial experiments indicated that its properties best resembled those of this natural
triphosphate). Figure 4A shows the rate of DNA synthesis in the presence of dATP,
dGTP and dCTP plus TTP or dPTP. DNA synthesis was measured by the incorporation
of [α-32P]dGTP using a primed M13 template at 72°C. Incorporation increased linearly
in the first 80 seconds when either dPTP or TTP was present. In order to calculate rates
of incorporation for different concentrations of substrate, time points were chosen over
intervals in which both triphosphate derivatives gave a linear rate of synthesis (Fig. 4B and
4C). Concentrations lower than 50 µM had to be used for dPTP (Fig. 4B) because with
higher concentrations the rate of DNA synthesis did not increase linearly with time. For
TTP, concentrations between 1.25 and 25 µM were used to obtain measurable differences
in rates of incorporation over time (Fig. 4C). The apparent Km values for TTP and dPTP
were determined by analysing the experimental data by the direct linear plot method
(Eisenthal and Cornish-Bowden, 1974 J. Biochem. 139, 715). The apparent Km for dPTP
under these experimental conditions was 22 µM, whilst that for TTP was 9.25 µM. The
value for dPTP thus compares favourably with those reported in the literature (Kong et
al. 1993) for the four natural dNTPs [14 µM - 17 µM].
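
The direct linear plot method referred to above can be sketched as follows. This is a minimal illustration rather than the inventors' analysis: each observation (s, v) is treated as the line Vmax = v + (v/s) x Km, and the medians of the pairwise intersections are taken as the estimates. The function name `direct_linear_plot` is hypothetical, and any data passed to it in an example would be synthetic rather than measured.

```python
from itertools import combinations
from statistics import median


def direct_linear_plot(substrate, velocity):
    """Estimate (Km, Vmax) as medians of pairwise line intersections.

    substrate: substrate concentrations (all non-zero)
    velocity:  initial velocities measured at those concentrations
    """
    if len(substrate) != len(velocity) or len(substrate) < 2:
        raise ValueError("need at least two paired observations")
    km_estimates, vmax_estimates = [], []
    for (s1, v1), (s2, v2) in combinations(zip(substrate, velocity), 2):
        # Each observation defines the line Vmax = v + (v / s) * Km.
        slope_difference = v1 / s1 - v2 / s2
        if slope_difference == 0:
            continue  # parallel lines have no intersection
        km = (v2 - v1) / slope_difference
        km_estimates.append(km)
        vmax_estimates.append(v1 + (v1 / s1) * km)
    if not km_estimates:
        raise ValueError("no intersecting pairs of observations")
    return median(km_estimates), median(vmax_estimates)
```

Using medians rather than means makes the estimate comparatively insensitive to individual outlying measurements, which is the usual reason for preferring this method to a linear transformation of the data.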

In order to compare the relative efficiencies of insertion of dPTP opposite template
adenine and guanine residues respectively, we adopted the procedure of Boosalis et al.
(Boosalis et al., 1987 J. Biol. Chem. 262, 14689-14696) for the determination of steady
state kinetics using one of two primed synthetic oligonucleotide templates (Fig. 5). The
32P-labelled primer in each case was extended by the incorporation of dGTP at two
positions, followed by dPTP (template 1 and 2), TTP (template 1) or dCTP (template 2).
Separation of the products by PAGE followed by quantitation of the radioactivity using
a PhosphorImager allowed the determination of the initial velocities (Boosalis et al.,
1987). Due to the very high extension rate of Taq polymerase, the kinetic parameters
were determined at 55°C. The Vmax values for the insertion of the particular triphosphate
opposite template and the Km values (µM) for particular insertions were determined
from non-linear regression fitting to the Michaelis-Menten equation.

Plots of V versus substrate concentration [S] are illustrated for the four possibilities PA,
TA, PG and CG in Figures 5A-D respectively and the kinetic parameters and catalytic
efficiencies (Vmax/Km) are given in Table 1. The results indicate that dPTP is virtually
indistinguishable from TTP in terms of its recognition by Taq polymerase. Furthermore, 
it is incorporated approximately three times more efficiently opposite template adenine 
than guanine residues. 

Km values have been reported for 8-oxodGTP with E. coli DNA polymerase I Klenow
fragment (Purmal et al., 1994) using a procedure analogous to that described here. Values
of 63 and 58 µM for insertion opposite C and A respectively were obtained at 37°C and
compare with an average value of approximately 1 µM for the normal dNTPs (Purmal et
al., 1994 Nucl. Acids Res. 22, 3930-3935). In addition, the analogue is a substrate for
the thermostable Tth DNA polymerase and has been shown to generate A→C transversions
at a rate of about 1% (Pavlov et al., 1994 Biochem. 33, 4695-4701).

Table 1

Template   Substrate   Vmax          Km (µM)       Vmax/Km       Relative Efficiency
A          dPTP        0.86 ± 0.06   5.2 ± 1.5     16.5 x 10^4   0.99
G          dPTP        0.69 ± 0.08   12.1 ± 4.8    5.7 x 10^4    0.11
A          TTP         1.02 ± 0.11   6.1 ± 1.5     16.7 x 10^4   1.00
G          dCTP        1.01 ± 0.09   2.03 ± 0.68   49.8 x 10^4   1.00
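
The relative efficiencies in Table 1 can be checked from the listed Vmax and Km values. The sketch below is illustrative only. It assumes that relative efficiency is the ratio of Vmax/Km for a substrate to Vmax/Km for the natural triphosphate opposite the same template base, an interpretation inferred from the tabulated figures rather than stated in the text; the names `TABLE_1`, `NATURAL`, `efficiency` and `relative_efficiency` are hypothetical.

```python
# Vmax and Km (µM) as listed in Table 1: (template, substrate) -> (Vmax, Km)
TABLE_1 = {
    ("A", "dPTP"): (0.86, 5.2),
    ("G", "dPTP"): (0.69, 12.1),
    ("A", "TTP"): (1.02, 6.1),
    ("G", "dCTP"): (1.01, 2.03),
}

# Natural triphosphate paired with each template base (the reference substrate).
NATURAL = {"A": "TTP", "G": "dCTP"}


def efficiency(template: str, substrate: str) -> float:
    """Catalytic efficiency Vmax/Km, in the units of the table inputs."""
    vmax, km = TABLE_1[(template, substrate)]
    return vmax / km


def relative_efficiency(template: str, substrate: str) -> float:
    """Efficiency relative to the natural substrate for the same template."""
    return efficiency(template, substrate) / efficiency(template, NATURAL[template])
```

On this reading, the ratio `efficiency("A", "dPTP") / efficiency("G", "dPTP")` corresponds to the statement that dPTP is incorporated roughly three times more efficiently opposite adenine than opposite guanine.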


Mutation frequencies induced by dPTP, 8-oxodGTP and their mixture
In order to investigate the mutations resulting when dPTP or 8-oxodGTP was used in
DNA synthesis reactions, PCR reactions were set up in which dPTP was added in
equimolar concentrations to the four normal dNTPs. The DNA was amplified for a
variable number of cycles and, in order to eliminate the incorporated base analogues, an
aliquot of the amplified DNA was used as template in a second PCR amplification in
which only the four normal dNTPs were used. The PCR product was subsequently
cloned, and some of the clones sequenced (in this way, the pattern of mutation was not
influenced by the DNA repair mechanisms of the E. coli host). Figure 6 shows the
accumulation of point mutations in DNA amplified in the presence of equimolar
concentrations of the four normal dNTPs and dPTP (□), or 8-oxodGTP (O) or dPTP +
8-oxodGTP (o). The data illustrate three points: (i) that very high mutation frequencies
can be obtained after 30 cycles of PCR and that these frequencies are higher than those
reported by other methods; (ii) that the number of mutations per clone can be controlled
by cycle number (Figure 6); and (iii) that mutational yield is more than additive when a
combination of the two analogues is used.

Mutation patterns generated by dPTP, 8-oxodGTP and their mixture 

Since the base-pairing potentials of the nucleosides dP and 8-oxodG differ (dP pairs
with adenine or guanine, as illustrated in Figure 2: Kong & Brown Nucl. Acids Res. 1989,
cited above; 8-oxodG pairs with adenine or cytosine: Pavlov et al., cited
above; Kuchino et al. Nature 1987 327, 77; Moriya, Proc. Natl. Acad. Sci. USA 1993
90, 1122), the inventors analysed the nucleotide changes produced by the two analogue
triphosphates and their combination. These results are shown in Figure 7 and are
expressed as a percentage of all mutations sequenced. The figure illustrates that dPTP
produces four transitions (A→G, T→C, G→A and C→T). Two transitions (A→G and T→C)
occur at higher frequency than the other two (G→A and C→T). This results from a
preference for insertion of dPTP opposite adenine in the template sequence. The
ambivalent base-pairing potential of P (as illustrated in Figure 2) results in the generation
of transition mutations either during the incorporation of the dPTP or in its replication
subsequent to incorporation. 8-OxodGTP produces two transversions (A→C and T→G)
resulting from the analogue being incorporated in place of TTP on either strand and
subsequently directing the insertion of dCTP, as observed previously (Pavlov et al., Mo
et al., and Kuchino et al., all cited above). The use of both analogues in a single DNA
synthesis reaction results in the generation of mutations produced by the two base
analogues with comparable frequencies. Moreover, some additional mutations (e.g. C→G)
are observed. The respective types of nucleotide changes induced by dPTP and
8-oxodGTP have a consequential effect on the amino acid sequence of the mutants.
Sequencing of 12 mutant clones obtained in the presence of dPTP and equimolar
concentrations of the four normal dNTPs showed that 40 out of 43 codons present in the
test sequence were mutated to alternative ones (Fig. 8). The sequences of 8 mutant clones
obtained in the presence of 8-oxodGTP and equimolar concentrations of the four normal
dNTPs showed that 18 codons were replaced (Fig. 9). It is worth noting that while both
analogues can lead to certain point mutations, other mutations are only produced by dPTP
and 8-oxodGTP together, demonstrating that the use of appropriate mixtures of
triphosphate derivatives of nucleoside analogues represents a powerful procedure for the
introduction of random mutations into DNA (Figure 11).
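
The strand logic behind these mutation patterns can be traced with a short sketch. The Python below is illustrative only and is not part of the disclosed method: the function name and the event labels are hypothetical, and it assumes simple Watson-Crick bookkeeping in which an analogue first replaces the normal partner of a template base and is later copied as though it were the partner of whichever normal base is inserted opposite it.

```python
# Watson-Crick partners of the four normal bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def induced_mutations(inserted_opposite, later_pairs_with):
    """Return the pair of substitutions fixed by one analogue event.

    inserted_opposite: template base the analogue is incorporated opposite.
    later_pairs_with:  normal base placed opposite the analogue when the
                       analogue-containing strand is itself copied.
    """
    # On incorporation the analogue takes the place of the normal partner
    # of the template base.
    replaced = COMPLEMENT[inserted_opposite]
    # After further copying, the analogue position ends up occupied by the
    # normal partner of the base that paired with the analogue.
    becomes = COMPLEMENT[later_pairs_with]
    if replaced == becomes:
        return set()  # the analogue behaved like the base it replaced
    # The same event read on the complementary strand gives the second change.
    return {
        f"{replaced}->{becomes}",
        f"{COMPLEMENT[replaced]}->{COMPLEMENT[becomes]}",
    }


# Events described in the text (labels are hypothetical, for illustration).
EVENTS = {
    "dPTP opposite A, then P pairs with G": ("A", "G"),
    "dPTP opposite G, then P pairs with A": ("G", "A"),
    "8-oxodGTP opposite A, then directs dCTP": ("A", "C"),
}

for label, (opposite, partner) in EVENTS.items():
    print(label, sorted(induced_mutations(opposite, partner)))
```

Under these assumptions the three events reproduce the four transitions attributed to dPTP and the two transversions attributed to 8-oxodGTP.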

Figure 11 shows the results of a series of mutagenesis experiments in which the following
equimolar nucleotide mixtures were used: the four normal dNTPs and dPTP (Fig. 11A);
the four normal dNTPs and 8-oxodGTP (Fig. 11B); the four normal dNTPs, dPTP and
8-oxodGTP (Fig. 11C). Similar experiments were carried out on a second target gene
with comparable results (data not shown). DNA amplification reactions were carried out
for variable numbers of cycles, as indicated in Fig. 11 by the number preceding the point
in the clone designation. The data show that a significant number of point mutations are
generated in the target gene under the three experimental conditions tested, although dPTP
clearly proved to be a much more efficient mutagen than 8-oxodGTP. The data also
clearly show that the number of mutations increased as a function of the number of cycles
used for the DNA amplification reaction. When the frequency of mutations was plotted
against the number of PCR cycles (Fig. 6) a linear relation was apparent both in the case
of 8-oxodGTP and for the mixture of dPTP and 8-oxodGTP at least up to 30 cycles. In
the case of dPTP the relation was linear for the first 20 cycles. For low numbers of
cycles, the combination of the two triphosphate analogues appeared to produce a total
number of mutations lower than that produced by dPTP alone, although the DNA
produced in such reactions contained both dP-induced and 8-oxodG-induced mutations (see
below).

Although the clones sequenced after different numbers of PCR cycles were obtained from 
separate PCR reactions, it is interesting to note that bases at particular positions were 
mutated more frequently than others. The mutations nevertheless appeared to accumulate 
over the entire gene sequence. 

The total number of bases sequenced in the cloned inserts and the mutations generated by
dPTP, 8-oxodGTP and their combination are listed in Table 2. The pattern of mutations
produced by dPTP, 8-oxodGTP and their combination is shown in Figure 7. Thus of the
mutations generated by dPTP 46.6% were A→G, 35.5% were T→C, 9.2% were G→A
and 8% were C→T. The major mutational events (A→G and T→C transitions) result from
the preferential incorporation of dPTP opposite A in either strand and subsequent pairing
of the incorporated P with G. To a lesser extent, the incorporation of dPTP opposite G
in either strand and subsequent pairing of P with A (G→A and C→T transitions) also
occurs. In addition to the four transitions mentioned above, one T→G and two A→T
transversions were found out of 4093 bp sequenced (see Fig. 7).

Table 2

Numbers and types of mutation produced using dPTP and 8-oxodGTP in PCR

Mutagenic dNTP       Bases sequenced    Number of point mutations
                                        total    coding    silent
dPTP                 4093               384      318       66
8-oxodGTP            5463               91       65        16
dPTP & 8-oxodGTP     3751               387      334
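
As a worked check on Table 2, the overall substitution frequency for each condition can be computed from the base and mutation totals. This is a minimal sketch using only the figures tabulated above; the variable and function names are hypothetical, and because the clones were pooled across different numbers of PCR cycles the results are aggregate values rather than per-cycle rates.

```python
# Totals transcribed from Table 2 (bases sequenced, total point mutations).
TABLE_2 = {
    "dPTP": {"bases": 4093, "total": 384},
    "8-oxodGTP": {"bases": 5463, "total": 91},
    "dPTP & 8-oxodGTP": {"bases": 3751, "total": 387},
}


def mutation_frequency(bases, total):
    """Fraction of sequenced bases that carry a point mutation."""
    return total / bases


frequencies = {name: mutation_frequency(**row) for name, row in TABLE_2.items()}
total_bases = sum(row["bases"] for row in TABLE_2.values())

for name, freq in frequencies.items():
    print(f"{name}: {100 * freq:.1f}% of bases mutated")
print("total bases sequenced:", total_bases)
```

The summed base count agrees with the 13,307 bp total quoted in the text.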



In the mutants generated with 8-oxodGTP two types of transversion mutations were
present: A→C (38.8%) and T→G (59%). These derive from the misincorporation of
8-oxodGTP opposite A in either template strand (Shibutani et al., 1991 Nature 349,
431-434). One C→A transversion was found out of 5463 bp sequenced. This mutation might
be due to incorporation of 8-oxodGTP opposite C in the template followed by
misincorporation of dATP opposite template 8-oxodG during subsequent replication. This
mutagenic mechanism for 8-oxodGTP has been previously reported to occur when
8-oxodGTP completely substitutes for dGTP (Cheng et al., 1992 J. Biol. Chem. 267,
166-172). A very small number of additional mutations were also found: two A→G transitions
and one G→A transition.

The pattern of mutations observed in clones mutagenised with the combination of dPTP
and 8-oxodGTP is shown in Fig. 9. All types of
transition and transversion mutations which were expected from the combination of the
two triphosphate analogues were observed although their respective frequencies were
slightly different from those predicted based on the combined frequencies of dPTP and
8-oxodGTP mutations. The mixture of the two analogues also increased the frequency of
additional mutations (1 x 10"^).

No insertions and a single two-nucleotide deletion were found using either analogue over
a total of 13,307 bp sequenced.

The effects of the four transition mutations induced by dPTP and the two transversion
mutations induced by 8-oxodGTP were also analysed at the codon level. Figure 10
shows the results of this analysis. The figure groups amino acids into five classes:
glycine, non polar, polar, positively charged and negatively charged and shows the codon
changes resulting from dPTP mutagenesis (circles), 8-oxodGTP mutagenesis (squares) and
their combination (triangles). Codon changes resulting from a single base substitution are
shown as full symbols, those resulting from a double substitution are shown as open
symbols.

In spite of the clear bias in the mutations induced by dPTP and 8-oxodGTP (Fig. 7), the
use of these analogues or their combination allowed extensive codon changes to be 
achieved. The two genes used as model templates contained 51 out of the possible 64 
codons (codons not present in either gene are marked with an asterisk in Fig. 10). Of the 
51 codons present, 50 were mutated by dPTP or 8-oxodGTP or by their combination. 

Of 224 codon changes which were found one or more times in the database, 49 were 
silent, 66 changed the amino acid to another of the same class, 105 changed the amino 
acid to one of a different class and 4 led to termination codons. 
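
The codon-level bookkeeping described above can be sketched as follows. This is an illustration under stated assumptions, not the inventors' analysis: the five class names come from the text, but the assignment of individual amino acids to classes is a common textbook grouping assumed here, and all function and variable names are hypothetical. The standard genetic code is used.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: class membership is a textbook grouping, not taken from Fig. 10.
AA_CLASS = {}
for class_name, members in {
    "glycine": "G",
    "non polar": "AVLIMFWPC",
    "polar": "STYNQ",
    "positively charged": "KRH",
    "negatively charged": "DE",
}.items():
    for aa in members:
        AA_CLASS[aa] = class_name

# Base changes attributed in the text to each analogue.
DPTP_CHANGES = {("A", "G"), ("T", "C"), ("G", "A"), ("C", "T")}
OXODGTP_CHANGES = {("A", "C"), ("T", "G")}


def classify(old_codon, new_codon):
    """Classify a codon change as silent, termination, same or different class."""
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if old_aa == new_aa:
        return "silent"
    if new_aa == "*":
        return "termination"
    same = AA_CLASS.get(old_aa) == AA_CLASS.get(new_aa)
    return "same class" if same else "different class"


def single_substitutions(codon, allowed):
    """Map each codon reachable by one allowed base change to its category."""
    result = {}
    for i, base in enumerate(codon):
        for old, new in allowed:
            if base == old:
                mutant = codon[:i] + new + codon[i + 1:]
                result[mutant] = classify(codon, mutant)
    return result


# Category counts quoted in the text; they should sum to 224 codon changes.
REPORTED = {"silent": 49, "same class": 66, "different class": 105, "termination": 4}
print(sum(REPORTED.values()))
print(single_substitutions("TGG", DPTP_CHANGES))
```

For example, the tryptophan codon TGG can reach one arginine codon and two stop codons through the dPTP-type transitions, but only a glycine codon through the 8-oxodGTP-type transversions.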

These results thus demonstrate that a broad spectrum of amino acid substitutions can be 
generated by dPTP and/or 8-oxodGTP mutagenesis. 

Experimental Details 

6-(2-Deoxy-β-D-erythropentofuranosyl)-3,4-dihydro-8H-pyrimido[4,5-c][1,2]oxazin-7-one 5'-triphosphate, Triethylammonium salt (dPTP) (structure 2)

The P nucleoside (Kong Thoo Lin & Brown 1989, cited previously), 54 mg, 0.2 mmol,
was dried in vacuo over P2O5 at 80°C overnight then suspended in dry trimethylphosphate
(0.5 mL) under argon. The flask was cooled in an ice-bath whilst phosphoryl chloride (21
µL) was injected with stirring. After stirring in the ice bath for 45 mins, a vortexed
mixture of bis-tributylammonium pyrophosphate (0.5 M in anhydrous DMF, 1.0 mL;
Ludwig & Eckstein J. Org. Chem. 1989 54, 631), tributylamine (0.2 mL) and anhydrous
DMF (0.4 mL) was added with rapid stirring in ice, followed after 10 mins by
triethylammonium bicarbonate solution (pH 7.5, 0.1 M, 20 mL). After 1 hr, the sample
was diluted with water (20 mL) and applied to a column of Sephadex A25 (diam. 25 x 330
mm) containing 0.05 M triethylammonium bicarbonate solution. The column was eluted
with a linear gradient of triethylammonium bicarbonate (1.5 L each of 0.05 - 0.8 M) at
4°C. The 5'-triphosphate of P eluted between 0.48 and 0.54 M buffer. The
triphosphate-containing fractions were combined and evaporated and the residue coevaporated with
methanol then dissolved in water (10 mL). The product was purified further by reverse
phase HPLC using a Waters 7.8 x 300 mm C18 semi-preparative column and a linear
gradient of 0-4.5% acetonitrile in 0.1 M triethylammonium bicarbonate pH 7.5 with a flow
rate of 2.5 mL/min. Appropriate fractions were combined, evaporated and residual buffer
removed by coevaporation with methanol to afford the pure triphosphate as the
tetrakis-triethylammonium salt (253 A, at pH 7, 0.067 mmol, 34%). δ(D2O) -9.57 (d,
γ-P), -10.34 (d, α-P), -22.02 (t, β-P). Approx. HPLC retention time = 18.5 min.

2-Amino-9-(2-deoxy-β-D-erythropentofuranosyl)-6-methoxyaminopurine
5'-triphosphate, Triethylammonium salt (dKTP) (structure 4, where NH2)

29.6 mg (0.1 mmol) of the K nucleoside (Brown & Kong Thoo Lin 1991, cited previously)
was dried in vacuo over P2O5 at 80°C overnight then suspended in dry trimethylphosphate
(0.25 mL) under argon. The flask was cooled in an ice bath whilst phosphoryl chloride
(12 µL) was injected with stirring. After stirring in the ice bath for 70 mins, a
well-vortexed mixture (see Ludwig & Eckstein, cited above) of bis-tributylammonium
pyrophosphate (0.5 M in anhydrous DMF, 0.5 mL), tributylamine (0.1 mL) and
anhydrous DMF (0.2 mL) was added with rapid stirring, followed after 7.5 mins by
triethylammonium bicarbonate solution (pH 7.5, 0.1 M, 5 mL). After 1 hr, the sample
was diluted with water (30 mL) and applied to a column of Sephadex A25 (diam. 26 x 220
mm) containing 0.05 M triethylammonium bicarbonate solution (pH 7.5). The column
was eluted with a linear gradient of triethylammonium bicarbonate (1 L each of 0.05 -
0.8 M) at 4°C. The desired 5'-triphosphate of K eluted between 0.50 and 0.68 M buffer.
The triphosphate-containing fractions were combined and evaporated and the residue
coevaporated with methanol then dissolved in water (10 mL). The product was purified
further by reverse phase HPLC using a Waters 7.8 x 300 mm C18 semi-preparative
column and a linear gradient of 0-4.5% acetonitrile in 0.1 M triethylammonium
bicarbonate pH 7.5. Appropriate fractions were combined, evaporated and residual buffer
removed by coevaporation with methanol to afford the pure triphosphate as the
tetrakis-triethylammonium salt (446 A, at pH 7, 0.043 mmol, 43%). δ(D2O) -10.37 (d,
γ-P), -10.89 (d, α-P), -23.62 (t, β-P). Approx. HPLC retention time = 14.7 min.

2'-Deoxy-8-hydroxyguanosine 5'-triphosphate, Triethylammonium salt (8-oxodGTP)
(structure 5, where R' = NH2)

This compound was prepared essentially according to Mo et al. (cited previously). Thus,
dGTP (trisodium dihydrate, 58.48 mg, 96 µmol) in 100 mM sodium phosphate (8 mL)
containing 30 mM ascorbic acid and 100 mM hydrogen peroxide was incubated at 37°C
for 4 hr in the dark. The product was purified directly by reverse phase HPLC using a
Waters 19 x 300 mm C18 preparative column and a linear gradient of 0-15% acetonitrile
in 0.1 M triethylammonium bicarbonate pH 7.5 with a flow rate of 7.5 mL/min. Appropriate
fractions were combined, evaporated and residual buffer removed by coevaporation with
methanol to afford the pure triphosphate as the tetrakis-triethylammonium salt. The
absorbance spectrum was identical to that described by Mo et al. (cited previously) and
by Wallace et al. (Nucl. Acids Res. 1994 22, 3930) - (123 A244, 10.3 A at pH 7, 5.2
µmol, 5.4%). δ(D2O) -9.68 (d, γ-P), -10.46 (d, α-P), -22.40 (t, β-P). Approx. HPLC
retention time = 27.9 min, dGTP 26.0 min.
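
The quoted yields can be cross-checked against the molar amounts given for each preparation. The sketch below is illustrative only: it assumes the starting quantities are 0.2 mmol of the P nucleoside, 0.1 mmol of the K nucleoside and 96 µmol of dGTP (the values consistent with the stated masses and product amounts), and the names used are hypothetical.

```python
def percent_yield(product_mmol, starting_mmol):
    """Isolated yield as a percentage of the starting material, mole for mole."""
    return 100.0 * product_mmol / starting_mmol


# (product in mmol, starting material in mmol, yield quoted in the text)
SYNTHESES = {
    "dPTP": (0.067, 0.2, 34.0),
    "dKTP": (0.043, 0.1, 43.0),
    "8-oxodGTP": (0.0052, 0.096, 5.4),
}

for name, (product, start, quoted) in SYNTHESES.items():
    computed = percent_yield(product, start)
    print(f"{name}: computed {computed:.1f}%, quoted {quoted}%")
```

Each computed value falls within rounding of the percentage reported for that compound.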

The foregoing section comprises a detailed discussion as to how a particular novel
compound (dPTP) within the scope of the invention may be synthesised. It will be clear
to those skilled in the art, with the benefit of the disclosure contained herein, how other
compounds within the scope of the invention may be made. For example, to prepare the
ribonucleotide equivalent (rP) of dP, essentially the same synthetic route could be
employed (using appropriate starting materials), using triacetylribofuranosyl chloride (for
improved solubility) instead of the di-p-toluoyl 2-deoxyribosyl chloride compound
described above.

Mutagenesis 

For mutagenesis experiments, 10 fmoles of template DNA were amplified using 0.5 µl of
AmpliTaq polymerase (5 U/µl, Applied Biosystems) in a 20 µl reaction containing the
appropriate sense and antisense primers at 0.5 µM, 2 mM MgCl2, 10 mM Tris-HCl
pH 8.3, 50 mM KCl, 1 g/l gelatine and dATP, dCTP, dGTP, TTP, dPTP and/or
8-oxodGTP, each at 500 µM. After various numbers of cycles (92°C for 1 min, 55°C for 1.5 min,
75°C for 5 min), 1 µl of the amplified material was used in a second PCR in which the
same conditions as above were used except that no dPTP or 8-oxodGTP was added to the
reaction mixture. The product of the second PCR was digested with BstEII and PstI and
cloned into M13VHPCR1 vector (Jones et al., 1986 Nature 321, 522-525). Sequence
analysis of single stranded DNA prepared from single phage isolates was performed using
Sequenase Version 2 (United States Biochemicals, Cleveland, OH) according to the
manufacturer.

Example 2 

Random mutagenesis and selection of an enzyme with improved catalytic activity 

In order to investigate the potential of the mutagenesis method in experiments of in vitro
directed molecular evolution, the enzyme TEM-1 β-lactamase was used as a model system.

β-lactamases are responsible for bacterial resistance to β-lactam antibiotics such as
ampicillins and cephalosporins by catalysing the hydrolysis of the β-lactam ring and
generating an inactive product. TEM-1 is a particularly attractive model system because
a very efficient chemical selection for improved function can be applied. Thus the model
allows assessment of the potential of the mutagenesis method per se, without the
possible limitations due to insufficient resolution of the screening/selection technique.

In this experiment we set out to improve the hydrolytic activity of the enzyme TEM-1
β-lactamase on the poorly hydrolysed substrate cefotaxime [minimum inhibitory
concentration (MIC) = 0.02 µg/ml] by repeated rounds of random mutagenesis and
selection on increasing concentrations of the antibiotic. The best mutants selected in the
first round are subjected to a second round of mutagenesis and selection in the presence
of a higher concentration of antibiotic. A stepwise improvement of the efficiency of TEM-1
hydrolytic activity is attained by progressively increasing the selective pressure.

The wild type TEM-1 gene from the plasmid pBR322 was used as a template for PCR
amplification in the presence of dPTP and 8-oxodGTP in addition to the four normal
dNTPs. The pool of mutants generated was cloned in the vector pBC KS+ and the library
of mutants transformed in E. coli by electroporation. The transformed bacteria were then
plated on increasing concentrations of the antibiotic cefotaxime. Bacteria growing on a
concentration of cefotaxime higher than the MIC of 0.02 µg/ml carry a mutant of TEM-1
with improved hydrolytic activity on this substrate. Selected mutants were analysed by
sequencing and the results of the first round of mutagenesis and selection are shown in
Table 3A. Different numbers of cycles of the mutagenic PCR were used to generate four
independent libraries of mutants (libA, libB, libC and libD) each characterised by a
different frequency of mutation (0.3% for libA, 0.3% for libB, 1.8% for libC and 6.3%
for libD), as determined by sequence analysis of unselected clones. An aliquot of each
library (~5x10* cfu) was plated on 0.2 µg/ml cefotaxime (10 times the MIC) and
inspection of the plates after incubation at 37°C for 24 h revealed several colonies (>50
colonies/plate). Table 3A shows the results of the sequence analysis of selected clones
growing on these plates. Asterisks indicate silent mutations; amino acid numbering is
according to Ambler et al. (Biochem. J. 1991 266, 3186). Underlined residues belong
to the leader peptide. Most of the selected clones contain multiple nucleotide substitutions
generating both silent and coding mutations. Interestingly, mutations at particular
positions (see for example L21P, V23A, G238S, E240G etc.) were found several times
in independently generated libraries.
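
To put the library mutation frequencies in per-gene terms, the expected number of substitutions per clone can be estimated. The sketch below is a rough illustration under loudly stated assumptions: it takes the TEM-1 coding sequence to be 861 bp (a length not given in the text) and treats substitutions as falling independently and uniformly along the gene. The names used are hypothetical.

```python
# ASSUMPTION: length of the TEM-1 coding sequence in bp; not stated in the text.
GENE_LENGTH_BP = 861

# Mutation frequencies per base quoted for the four libraries.
LIBRARY_FREQUENCY = {"libA": 0.003, "libB": 0.003, "libC": 0.018, "libD": 0.063}


def expected_substitutions(freq, length=GENE_LENGTH_BP):
    """Mean number of substitutions per gene copy."""
    return freq * length


def fraction_unmutated(freq, length=GENE_LENGTH_BP):
    """Probability that a gene copy carries no substitution at all,
    assuming each base mutates independently with probability freq."""
    return (1.0 - freq) ** length


for name, freq in LIBRARY_FREQUENCY.items():
    print(
        f"{name}: ~{expected_substitutions(freq):.1f} substitutions per gene, "
        f"{100 * fraction_unmutated(freq):.2g}% of clones unmutated"
    )
```

On these assumptions even the least mutated libraries would carry, on average, more than one nucleotide substitution per gene.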

The high number of colonies growing on 0.2 µg/ml cefotaxime prompted us to look for
mutants able to grow on an even higher concentration of antibiotic. All colonies were
scraped off the plates, resuspended in broth and an aliquot from each library was plated on
medium containing 2 µg/ml cefotaxime. Plates were incubated at 37°C for 24 h and then
inspected for colony growth. Selected clones were sequenced and the results are shown
in Table 3B. All the mutants sequenced, except one, contained a G238S mutation. The
only exception is a clone containing an R164S mutation. Both these mutations have been
found in natural isolates of TEM-1 β-lactamase showing improved hydrolytic activity on
cefotaxime. Clone 6a contains R241H and D252G mutations in addition to G238S. When
its enzymatic activity was compared with that of the single mutant G238S it appeared to
be at least ten times higher, with colonies growing on >20 µg/ml.

These preliminary results indicate that random mutagenesis of DNA by PCR using
triphosphate analogues is an effective way to generate large pools of protein mutants
among which it is possible to select variants showing improved performance. The method
we propose very efficiently generates large numbers of mutants and allows control over
the frequency of mutation. As a consequence, it was possible to select enzyme mutants
with improved catalytic activity by screening a relatively small number of variants (~10*),
well within the average library size. Moreover, the ability to control the frequency
of mutation and to introduce, on average, more than one nucleotide substitution per gene,
allowed us to isolate in a single step of mutagenesis and selection a triple mutant in which
the mutations appear to have a cooperative effect on the efficiency of TEM-1 hydrolytic
activity.

Discussion 

Further developments of the approach described herein are envisaged. Firstly,
modification of dPTP to produce a closely related analogue which displays a tautomeric
constant closer to unity would adjust the balance between all four possible transition
mutations. The second concerns the ratio of transition versus transversion mutations in
experiments in which both the dPTP (or related) analogue and 8-oxodGTP are used in
combination. In the experiments reported here, both analogues were used at the
concentration of 500 µM but the higher rate of incorporation and/or extension of dP led
to a higher frequency of dP-induced mutations. It should be possible to obtain comparable
numbers of transitions and transversions from mutagenesis reactions in which the
concentrations of dPTP and 8-oxodGTP are adjusted in order to compensate for their
different kinetics. Finally, it is clear that six transversion mutations (C→G, G→T, T→A,
A→T, C→A and G→C) either are not produced by the dNTP mixture, or else they are
produced at very low frequencies. Other analogues therefore, such as O^-ethylthymidine
triphosphate which induces A→T transversions, albeit at a low frequency (Singer et al.
1989), could be used in order to extend the range of transversion mutations.

While in vitro point mutagenesis followed by selection clearly aims to mimic an important
aspect of protein evolution, it is clear that nature's strategy of protein engineering equally
relies on a variety of other processes such as gene insertions, deletions, duplication and
recombination. Procedures are being developed which aim to reproduce these events in
vitro and harness their potential for protein engineering. In one such procedure, gene
fragments obtained by DNaseI treatment are reassembled by PCR in a process that
promotes random recombination (Stemmer, 1994 Proc. Natl. Acad. Sci. USA 91,
10747-10751). The effectiveness of this approach has been clearly illustrated by its application
in the engineering of β-lactamase mutants, one of which, when expressed in E. coli,
showed a 32,000 fold increase in minimum inhibitory concentration compared to wild-type
enzyme (Stemmer 1994 Nature 370, 389-391). It is of interest, however, that the
sequence of the improved mutant only contained 5 point mutations compared to the
wild-type enzyme. This shows that Stemmer's protocol is accompanied by an appreciable rate
of point mutagenesis and that, at least in the β-lactamase example, such point mutations
are entirely responsible for the maturation of the enzyme in the absence of bona fide
recombination. The results, nevertheless, reinforce the concept that point mutagenesis is
a powerful approach for protein engineering in vitro and suggest that recombination
coupled with point mutagenesis may have a special potential for engineering new proteins
from series of homologous genes.

Previous DNA mutagenesis protocols typically resulted in relatively low mutational rates.
The procedure described here, however, can lead to a frequency of nucleotide substitutions
approaching 1 in 5 after 30 cycles of DNA amplification. This clearly raises the issue of
an optimal mutational load for protein engineering.

It seems reasonable to suggest that the lower limit of an efficient random mutagenesis
protocol may aim at introducing, on average, one amino acid change per sequence but this
may well be sub-optimal. While a very large number of simultaneous substitutions would
clearly destroy protein stability, studies with several model systems suggest that relatively
few amino acid positions are critical for function and stability.


In T4 lysozyme, for example, substitution of each amino acid (except for the initiator
methionine) with 13 different amino acids has shown that more than half the positions
tolerated all substitutions (Rennell et al., 1991 J. Mol. Biol. 222, 67-87). Furthermore,
out of 2015 mutations, only 173 were seriously deleterious and these were confined to 53
out of 163 positions (Rennell et al., 1991). Studies on the λ repressor also demonstrated
that numerous substitutions in the core of the protein are tolerated (Lim et al.,
1991 J. Mol. Biol. 219, 359-376).
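
The T4 lysozyme figures give a feel for how quickly function might be lost as substitutions accumulate. The sketch below is a back-of-envelope illustration, not a result from the cited study: it assumes that each random substitution is seriously deleterious with the same probability as in the tested set, and that the effects of multiple substitutions are independent. Both assumptions are simplifications, and the names used are hypothetical.

```python
# Figures quoted in the text for T4 lysozyme.
TOTAL_SUBSTITUTIONS = 2015
SERIOUSLY_DELETERIOUS = 173
SENSITIVE_POSITIONS = 53
TOTAL_POSITIONS = 163

# Fraction of single substitutions that were seriously deleterious.
P_DELETERIOUS = SERIOUSLY_DELETERIOUS / TOTAL_SUBSTITUTIONS
# Fraction of positions at which any such substitution was found.
FRACTION_SENSITIVE = SENSITIVE_POSITIONS / TOTAL_POSITIONS


def fraction_tolerated(k, p=P_DELETERIOUS):
    """Estimated fraction of variants with k substitutions, none of which is
    seriously deleterious (ASSUMPTION: independent, identically likely hits)."""
    return (1.0 - p) ** k


print(f"deleterious single substitutions: {100 * P_DELETERIOUS:.1f}%")
print(f"sensitive positions: {100 * FRACTION_SENSITIVE:.1f}%")
for k in (1, 2, 5, 10):
    print(f"{k} substitutions: ~{100 * fraction_tolerated(k):.0f}% tolerated")
```

Under these assumptions a substantial fraction of variants carrying several substitutions would still avoid a seriously deleterious change, which is in line with the argument made in the text.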

Although these studies do not address directly the effect of multiple random mutations,
they suggest, nevertheless, that these would not invariably result in the loss of protein
function, an argument reinforced by the results of studies on somatic hypermutation of
antibody genes. Antibodies isolated in secondary or tertiary responses contain a
considerable number of replacement mutations (see Berek & Milstein 1987 Immunol. Rev.
96, 23-41). In cases in which the role of individual substitutions has been analysed, it
appeared that only a few mutations played a role in affinity maturation (for example 3 out
of 19 amino acids in the anti-p-azophenylarsonate antibody) (Sharon 1990 Proc. Natl.
Acad. Sci. USA 87, 4814-4817), yet the VH and VL domains appear to tolerate
substitution rates approaching 1 in 10.

The procedure described here may allow the optimal mutational load for protein
engineering to be addressed experimentally since this can now be readily controlled and
libraries of protein mutants carrying different numbers of substitutions can be constructed.
These studies should assist in exploring the potential of the mutagenesis/selection approach
for protein engineering.

Finally, the relationship of the present invention to combinatorial oligonucleotide chemistry
should be mentioned. In the latter, a wide variety of short (n) repertoires, generally of
large sequence content (e.g. 4^n), can be synthesised. Essentially all possible sequence
isomers are formed and effective rounds of selection have to be applied in a variety of
formats to identify the sequence of interest. Typically, in the present approach an already
functional DNA sequence is amplified under a variable mutational pressure and the
products then cloned. The mutational frequencies observed, as related to PCR cycle
number, were derived from a few tens of randomly-picked colonies. These mutation
frequencies presumably hold for all the clones carrying the insert. Thus the number
of mutant inserts sequenced represents a very small fraction of the total formed in each
amplification process. Routine selection methods akin to those used with the large
synthetic repertoires should demonstrate the applicability of the present invention to the
problems discussed above.

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Medical Research Council 

(B) STREET: 20 Park Crescent 

(C) CITY: London 

(E) COUNTRY: United Kingdom 

(F) POSTAL CODE (ZIP): W1N 4AL

(G) TELEPHONE: (0171) 636 5422 

(H) TELEFAX: (0171) 323 1331 

(ii) TITLE OF INVENTION: Improvements in or Relating to Mutagenesis of Nucleic Acids

(iii) NUMBER OF SEQUENCES: 5 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)


(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 285 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

GAGTCTGGAG GAGGCTTGAT CCAGCCTGGG GGGTCCCTGA GACTCTCCTG TGCAGCCTCT 60

GGGTTCACCG TCAGTAGCAA CTATATGAGC TGGGTCCGCC AGGCTCCAGG GAAGGGGCTG 120

GAGTGGGTCT CAGTTATTTA TAGCGGTGGT AGCACATACT ACGCAGACTC CGTGAAGGGC 180

CGATTCACCA TCTCCAGAGA CAATTCCAAG AACACGCTGT ATCTGCAAAT GAACAGCCTG 240

AGAGCTGAGG ACACGGCCGT GTATTACTGT GCAAGAAAGT TTCCT 285


(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Glu Ser Gly Gly Gly Leu Ile Gln Pro Gly Gly Ser Leu Arg Leu Ser
1               5                   10                  15

Cys Ala Ala Ser Gly Phe Thr Val Ser Ser Asn Tyr Met Ser Trp Val
            20                  25                  30

Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ser Val Ile Tyr Ser
        35                  40                  45

Gly Gly Ser Thr Tyr Tyr Ala Asp Ser Val Lys Gly Arg Phe Thr Ile
    50                  55                  60

Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu
65                  70                  75                  80

Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Arg Lys Phe Pro
                85                  90                  95


(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GGCCTTCATA TTCACAAACG AAT


(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

TCTTACCATT CGTTTGTGAA TATCAAGGCC


(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
TCTTGCCATT CGTTTGTGAA TATCAAGGCC

CLAIMS

1. A compound having the structure:



where X1 = O, N-alkyl, N+-dialkyl, or N-benzyl; X2 = triphosphate (P3O9)4-,
diphosphate (P2O6)3-, thiotriphosphate (P3O8S)4-, or analogues thereof, but not H; and
X3 = H, NH2, F or OR, where R may be any group, but is preferably H, methyl, allyl
or alkaryl.

2. A compound according to claim 1, wherein X1 = O.

3. A compound dPTP according to claim 2, wherein X2 = triphosphate, and X3 = H
or OH.

4. A method of mutating a nucleic acid sequence, comprising replicating a template 
sequence in the presence of a nucleoside triphosphate analogue in accordance with any 
one of claims 1, 2 or 3, so as to form non-identical copies of the template sequence 
comprising one or more nucleoside phosphate analogue residues. 

5. A method according to claim 4, comprising replicating a template sequence in the
presence of deoxyP triphosphate (dPTP) or a functional equivalent thereof, so as to
form non-identical copies of the template sequence comprising one or more dP
residues.

6. A method according to claim 4 or 5, wherein the template sequence is replicated in
the presence of one or more additional nucleoside triphosphate analogues.

7. A method according to any one of claims 4, 5 or 6, wherein the template sequence
is replicated in the presence of a compound having the structure:

[structure diagram not reproduced]

where Y1 = OH, O-alkyl, NH2 or N(alkyl)2; Y2 = H or NH2; Y3 = triphosphate
(P3O9)4-, diphosphate (P2O6)3-, thiotriphosphate (P3O8S)4-, or analogues thereof, but not
H; and Y4 = H, NH2, F, or OR where R may be any group but is preferably H,
methyl, allyl or alkaryl.

8. A method according to any one of claims 4, 5 or 6, wherein the template sequence
is replicated in the presence of 8-oxodGTP, and/or dKTP, and/or O^-ethylthymidine
triphosphate.

9. A method according to any one of claims 4-8, comprising the further step of
replicating the non-identical copies of the template sequence in the presence of the four
normal dNTPs, but in the absence of analogues thereof, to form further non-identical
copies of the template sequence comprising only the four normal deoxynucleotides.

10. A method according to any one of claims 4-9, wherein the replication of the
template sequence, and/or the replication of the non-identical copies thereof, is
achieved by means of PCR.

11. A method according to any one of claims 4-10, wherein the template sequence is 
replicated in the additional presence of the four normal deoxynucleotides. 

12. A method according to any one of claims 4-11, wherein the template sequence is 
replicated in the presence of 1 µM to 600 µM dPTP.

13. A method according to any one of claims 4-12, wherein the template sequence is
replicated in the presence of 1 µM to 600 µM 8-oxodGTP.

14. A kit for performing the method of any one of claims 4-13, comprising a
nucleoside triphosphate analogue in accordance with any one of claims 1, 2 or 3,
means for replicating a template sequence so as to incorporate the nucleoside phosphate
analogue into non-identical copies of the template sequence, and instructions for use
according to the method of any one of claims 4-13. 

15. A kit according to claim 14, wherein the nucleoside triphosphate analogue is dPTP. 

16. A kit according to claim 14 or 15, wherein the means for replicating the template 
sequence comprises means for performing the polymerase chain reaction. 

17. A kit according to any one of claims 14, 15 or 16, further comprising the four
normal deoxynucleotides. 

18. A kit according to any one of claims 14-17, further comprising 8-oxodGTP and/or 
dKTP, and/or O^-ethylthymidine triphosphate.


19. A compound having the structure:

[structure diagram not reproduced]

where Y1 = OH, O-alkyl, NH2 or N(alkyl)2; Y2 = H or NH2; Y3 = triphosphate
(P3O9)4-, diphosphate (P2O6)3-, thiotriphosphate (P3O8S)4-, or analogues thereof, but not
H; and Y4 = H, NH2, F, or OR where R may be any group but is preferably H,
methyl, allyl or alkaryl.

20. A compound according to claim 19, wherein Y1 = OCH3; Y3 = triphosphate; and
Y4 = H or OH.

21. A method of mutating a nucleic acid sequence, comprising replicating a template 
sequence in the presence of a nucleoside triphosphate analogue according to claim 19 
or 20, so as to form non-identical copies of the template sequence comprising one or 
more nucleoside phosphate analogues. 

22. A method according to claim 21, and in accordance with any one of claims 4-13.

23. A method of making a DNA sequence in vitro, the method comprising treating in
appropriate conditions a mixture comprising the four normal dNTPs and a nucleoside
triphosphate analogue according to claims 1-3 or claims 19-20, with a DNA
polymerase in the presence of a template strand of nucleic acid, so as to form a 
sequence of nucleotides comprising at least one analogue. 


24. A method of making an RNA sequence in vitro, the method comprising treating in
appropriate conditions a mixture comprising the four normal rNTPs and a
ribonucleoside triphosphate analogue according to claims 1-3 or claims 19-20, with an
RNA polymerase in the presence of a template strand of nucleic acid, so as to form a
sequence of ribonucleotides comprising at least one analogue.